Information retrieval model based on graph comparison

Authors

  • Quoc-Dinh Truong
  • Taoufiq Dkaki
  • Josiane Mothe
  • Pierre-Jean Charrel
Abstract

We propose a new method for Information Retrieval (IR) based on the comparison of graph vertices. Its main goal is to enhance the core IR process of finding the documents in a collection that are relevant to a user’s needs. The method relies on graph comparison and involves the recursive computation of similarity. In the framework of this approach, documents, queries and indexing terms are viewed as vertices of a bipartite graph in which edges go from a document or a query (the first node type) to an indexing term (the second node type). Edges reflect the link that exists between documents or queries on the one hand and indexing terms on the other hand. In our model, graph edge settings reflect the tf-idf paradigm. The proposed similarity measure instantiates and extends the principle that the resemblance of two items or objects can be computed from the similarities of the items to which they are related. Our method also takes into account the concept of similarity propagation over graph edges. Experiments conducted on four small IR test collections (TREC 2004 Novelty Track, CISI, Cranfield and Medline) demonstrate the effectiveness of our approach and its feasibility as long as the graph size does not exceed a few thousand nodes. The results show that our method substantially outperforms the vector-based cosine model, sometimes more than doubling the precision, up to the top sixty returned documents. The issue of computational complexity is reconsidered in the context of MAC-FAC approaches ("many are called but few are chosen"). More precisely, we suggest that our method can be used successfully as the FAC stage, combined with a fast and computationally cheap method as the MAC stage.
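The abstract does not state the similarity formula, so the following Python sketch is only an illustration of the general idea, not the authors' measure. It assumes tf-idf edge weights with a smoothed idf, treats the query as one more node of the first type, and uses a simple coupled update in which item similarities and term similarities are recomputed from each other. The function names (`tfidf_edges`, `propagate`) and the toy data are hypothetical.

```python
import math
from collections import Counter

def tfidf_edges(items):
    """items: name -> list of tokens. Returns name -> {term: weight}.
    Assumption: weight = tf * log(1 + N/df), rows scaled to unit length."""
    n = len(items)
    df = Counter(t for toks in items.values() for t in set(toks))
    edges = {}
    for name, toks in items.items():
        row = {t: c * math.log(1 + n / df[t]) for t, c in Counter(toks).items()}
        norm = math.sqrt(sum(v * v for v in row.values())) or 1.0
        edges[name] = {t: v / norm for t, v in row.items()}
    return edges

def _normalize(raw):
    """Rescale a symmetric score table so that each self-similarity is 1."""
    return {a: {b: (v / math.sqrt(raw[a][a] * raw[b][b])
                    if raw[a][a] and raw[b][b] else 0.0)
                for b, v in row.items()}
            for a, row in raw.items()}

def propagate(edges, iterations=2):
    """Alternately update item-item and term-term similarities.
    With identity term similarities, the first pass is the plain cosine."""
    items = list(edges)
    terms = sorted({t for row in edges.values() for t in row})
    term_sim = {t: {u: float(t == u) for u in terms} for t in terms}
    item_sim = {}
    for _ in range(iterations):
        # Two items are similar if they point to similar terms.
        raw_items = {a: {b: sum(wa * term_sim[t][u] * wb
                                for t, wa in edges[a].items()
                                for u, wb in edges[b].items())
                         for b in items} for a in items}
        item_sim = _normalize(raw_items)
        # Two terms are similar if similar items point to them.
        raw_terms = {t: {u: sum(edges[a].get(t, 0.0) * item_sim[a][b]
                                * edges[b].get(u, 0.0)
                                for a in items for b in items)
                         for u in terms} for t in terms}
        term_sim = _normalize(raw_terms)
    return item_sim

# Toy collection: "q" is the query node, "d1".."d3" are document nodes.
collection = {
    "q": ["graph", "retrieval"],
    "d1": ["graph", "retrieval", "model"],
    "d2": ["network", "search", "model"],
    "d3": ["cooking", "pasta", "recipe"],
}
scores = propagate(tfidf_edges(collection), iterations=2)
ranking = sorted((d for d in collection if d != "q"),
                 key=lambda d: scores["q"][d], reverse=True)
print(ranking)
```

In this toy collection, d2 shares no term with the query, so a plain cosine comparison scores it zero; after a second pass it receives a positive score because it shares "model" with d1, which does match the query. The pairwise loops over terms also make visible why cost grows quickly with graph size, which is the motivation for reserving such a measure for the FAC stage.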


Similar resources

Factors Affecting Student's Scientific Information Retrieval based on Fuzzy Logic Method Compared to Traditional Method

Background and aim: The aim of this study was to identify the factors affecting students' performance in information retrieval based on the fuzzy logic method compared to the traditional method. Materials and methods: This descriptive survey study was performed using a quantitative approach. The research population was 34 PhD students, and a researcher-made questionnaire was used. Data were analyzed...

Assessing the Internal Structure of the Ellis Information Retrieval Model in Order to Present the Persian Norm of Web Retrieval Tools

Introduction: This study evaluated the internal structure of the Ellis information seeking model in the student community with the aim of presenting the Persian norm. Methods: This is a descriptive-analytical study conducted by the cross-sectional survey method in the second semester of the academic year 1399-1400. The population comprised 280 graduate students at Ahvaz Jundishapur University of Medical Scien...

Analysis of the Therapists’ Information Behavior in the diagnosis and treatment of mental disorders based on Kuhlthau's information retrieval process model

Background and Aim: Under the influence of various factors, people use different methods to obtain information and exhibit different information behaviors. These behaviors have been introduced in the form of patterns and models of information retrieval by information science experts in recent decades, which can be used in various fields. One of these areas that almost all people are...

An Effective Path-aware Approach for Keyword Search over Data Graphs

Abstract—Keyword Search is known as a user-friendly alternative for structured languages to retrieve information from graph-structured data. Efficient retrieving of relevant answers to a keyword query and effective ranking of these answers according to their relevance are two main challenges in the keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...

Private Key based query on encrypted data

Nowadays, users of information systems tend to use a central server to decrease data transfer and maintenance costs. Since such a system is not fully trustworthy, users' data is usually kept encrypted. However, encryption is not a cure-all for security problems and cannot guarantee data security. In other words, there are some techniques that can endanger security of encrypted d...

GVC: a graph-based Information Retrieval Model

GVC is a new information retrieval model that is based on Graph Vertices Comparison (GVC). It implements a new similarity measure to compare documents and users' queries based on graph matching. In this model, graphs are composed of two types of nodes. Documents, queries and indexing terms are viewed as vertices of this bipartite graph where each edge goes from a document or a query –first type...


Publication date: 2008